-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(drand): add null HistoricalBeaconClient for old beacons #12830
base: master
Are you sure you want to change the base?
Conversation
Checked out this branch, and tried to see if I still encountered the #11802 issue. It got a bit longer then previous:
It then looped on the
|
OK, well I don't think the drand errors were your fatal problem here it's just looping in the background and logging those entries. But unfortunately we're stuck with the optimizing client: https://github.com/drand/drand/blob/v1.5.11/client/optimizing.go#L194, so perhaps we can't take this approach at all and need it fixed inside drand otherwise it's going to continue to speed test the null-client and spam those info log messages. I think your problem here is that you can't use a snapshot file when trying to import the full "chain". At least that's my understanding from reading the code - the main difference between importing a snapshot and importing the chain is that a chain import does a full tipset validation back to genesis, which is why it wants the beacons to be available. It calls I suppose a chain export is a |
Updated this branch with a modification that attempts to use the "watcher" mechanism in drand because watchers don't get speed tested, so if I can convince it that my historical null-client is a watcher then maybe it'll not speed test it. I haven't tested this but @rjan90 if you still have that snapshot handy, would you mind running it again to see if you get any of those INFO logs or any other errors from drand? |
Reran with the latest commit, and this is what I get:
|
e68c3e7
to
cf963a5
Compare
cf963a5
to
30ca496
Compare
OK, one tiny change and this seems to work; I grabbed my own snapshot and have been running it and the only logs are the badger ones so I believe this is solved, even if it is a hack. I'd still like @AnomalRoil to take a look and suggest alternative approaches. Marking @Kubuxu as reviewer since he's touched a lot of this stuff previously. |
Some feedback is in https://filecoinproject.slack.com/archives/C047Z15JNEA/p1738587544472029 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments.
@@ -113,6 +112,18 @@ func NewDrandBeacon(genesisTs, interval uint64, ps *pubsub.PubSub, config dtypes | |||
opts = append(opts, gclient.WithPubsub(ps)) | |||
} else { | |||
log.Info("drand beacon without pubsub") | |||
if len(clients) == 0 { | |||
// This hack is necessary to convince a drand beacon to start without any clients. For | |||
// historical becaons we need them to be able to verify old entries but we don't need to fetch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// historical becaons we need them to be able to verify old entries but we don't need to fetch | |
// historical beacons we need them to be able to verify old entries but we don't need to fetch |
// This hack is necessary to convince a drand beacon to start without any clients. For | ||
// historical becaons we need them to be able to verify old entries but we don't need to fetch | ||
// new ones. With pubsub enabled, it acts as a client so drand is happy, but if we don't have | ||
// pubsub then drand will complain about old beacons withotu clients. So we make one that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// pubsub then drand will complain about old beacons withotu clients. So we make one that | |
// pubsub then drand will complain about old beacons without clients. So we make one that |
var _ dclient.Client = historicalBeaconClient{} | ||
|
||
// historicalBeaconClient is a drand client that doesn't actually do anything. It's used when | ||
// we don't have a drand network to connect to but still need to provide a beacon client. | ||
// We don't expect calls through to the client to be made since we should only be verifying old | ||
// randomness, not fetching it. | ||
type historicalBeaconClient struct{} | ||
|
||
func (h historicalBeaconClient) Get(ctx context.Context, round uint64) (dclient.Result, error) { | ||
return nil, xerrors.Errorf("no historical randomness available") | ||
} | ||
|
||
func (h historicalBeaconClient) Watch(ctx context.Context) <-chan dclient.Result { | ||
return nil | ||
} | ||
|
||
func (h historicalBeaconClient) Info(ctx context.Context) (*dchain.Info, error) { | ||
return nil, xerrors.Errorf("no historical randomness available") | ||
} | ||
|
||
func (h historicalBeaconClient) RoundAt(time.Time) uint64 { | ||
return 0 | ||
} | ||
|
||
func (h historicalBeaconClient) Close() error { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very surprised the Empty
client isn't working for your needs?
opts = append(opts, dclient.WithWatcher(func(chainInfo *dchain.Info, cache dclient.Cache) (dclient.Watcher, error) { | ||
return historicalClient, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the trick is to have your client's Get
return a drand.ErrEmptyClientUnsupportedGet
error to avoid having to have a "valid" client in Wrap
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into this, it's possible we're not properly "supporting" that special error in the testSpeed
function.
I could change it if it'd help, so that we're not reporting the
oc.log.Infow("", "optimizing_client", "endpoint down when speed tested", ...
log line when it sees that error.
Fixes: #11802
Ref: #11802 (comment)
I'm pretty sure this will work since we only need to verify, not fetch, old beacon entries. So when we remove
Servers
from the config (like we've done for Mainnet and Incentinet), this will satisfy drand but also be a noop.